Skip to content

Conversation

broxigarchen
Copy link
Contributor

@broxigarchen broxigarchen commented Oct 6, 2025

This is a follow up patch from #161764

This

  1. revert an error in previous patch that the v_mov_b16_t16_e32 should still select idx 1 while only e64 should select idx 2
  2. revert the utility function added in previous patch
  3. added a src modifier check in isFoldCopyable check for v_mov_b16_t16_e64

@broxigarchen broxigarchen marked this pull request as ready for review October 6, 2025 15:00
@broxigarchen broxigarchen requested review from arsenm and Sisyph October 6, 2025 15:00
@llvmbot
Copy link
Member

llvmbot commented Oct 6, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Brox Chen (broxigarchen)

Changes

This is a follow up patch from #161764

This

  1. fix a error that the v_mov_b16_t16_e32 should still select idx 0
  2. remove the utility function added in previous patch
  3. added a src modifier check in isFoldCopyable check for v_mov_b16_t16_e64

Full diff: https://github.com/llvm/llvm-project/pull/162101.diff

3 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/SIFoldOperands.cpp (+1-1)
  • (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.cpp (+2-27)
  • (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.h (-1)
diff --git a/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
index 90c828ba8dfab..5d34cdbbf20c5 100644
--- a/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
+++ b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
@@ -932,7 +932,7 @@ static MachineOperand *lookUpCopyChain(const SIInstrInfo &TII,
   for (MachineInstr *SubDef = MRI.getVRegDef(SrcReg);
        SubDef && TII.isFoldableCopy(*SubDef);
        SubDef = MRI.getVRegDef(Sub->getReg())) {
-    unsigned SrcIdx = TII.getFoldableCopySrcIdx(*SubDef);
+    const int SrcIdx = MovOp == AMDGPU::V_MOV_B16_t16_e64 ? 2 : 1;
     MachineOperand &SrcOp = SubDef->getOperand(SrcIdx);
 
     if (SrcOp.isImm())
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index 46757cf5fe90c..b411648af2255 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -3411,7 +3411,6 @@ void SIInstrInfo::insertSelect(MachineBasicBlock &MBB,
 bool SIInstrInfo::isFoldableCopy(const MachineInstr &MI) {
   switch (MI.getOpcode()) {
   case AMDGPU::V_MOV_B16_t16_e32:
-  case AMDGPU::V_MOV_B16_t16_e64:
   case AMDGPU::V_MOV_B32_e32:
   case AMDGPU::V_MOV_B32_e64:
   case AMDGPU::V_MOV_B64_PSEUDO:
@@ -3428,34 +3427,10 @@ bool SIInstrInfo::isFoldableCopy(const MachineInstr &MI) {
   case AMDGPU::AV_MOV_B32_IMM_PSEUDO:
   case AMDGPU::AV_MOV_B64_IMM_PSEUDO:
     return true;
-  default:
-    return false;
-  }
-}
-
-unsigned SIInstrInfo::getFoldableCopySrcIdx(const MachineInstr &MI) {
-  switch (MI.getOpcode()) {
-  case AMDGPU::V_MOV_B16_t16_e32:
   case AMDGPU::V_MOV_B16_t16_e64:
-    return 2;
-  case AMDGPU::V_MOV_B32_e32:
-  case AMDGPU::V_MOV_B32_e64:
-  case AMDGPU::V_MOV_B64_PSEUDO:
-  case AMDGPU::V_MOV_B64_e32:
-  case AMDGPU::V_MOV_B64_e64:
-  case AMDGPU::S_MOV_B32:
-  case AMDGPU::S_MOV_B64:
-  case AMDGPU::S_MOV_B64_IMM_PSEUDO:
-  case AMDGPU::COPY:
-  case AMDGPU::WWM_COPY:
-  case AMDGPU::V_ACCVGPR_WRITE_B32_e64:
-  case AMDGPU::V_ACCVGPR_READ_B32_e64:
-  case AMDGPU::V_ACCVGPR_MOV_B32:
-  case AMDGPU::AV_MOV_B32_IMM_PSEUDO:
-  case AMDGPU::AV_MOV_B64_IMM_PSEUDO:
-    return 1;
+    return !TII->hasAnyModifiersSet(MI);
   default:
-    llvm_unreachable("MI is not a foldable copy");
+    return false;
   }
 }
 
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.h b/llvm/lib/Target/AMDGPU/SIInstrInfo.h
index cc59acf1ebd94..a21089f8e0fcc 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.h
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.h
@@ -417,7 +417,6 @@ class SIInstrInfo final : public AMDGPUGenInstrInfo {
                                   const MachineInstr &MIb) const override;
 
   static bool isFoldableCopy(const MachineInstr &MI);
-  static unsigned getFoldableCopySrcIdx(const MachineInstr &MI);
 
   void removeModOperands(MachineInstr &MI) const;
 

@broxigarchen broxigarchen force-pushed the main-fix-true16-si-fold-1 branch from 6c6ce40 to 1eb464f Compare October 6, 2025 17:12
@broxigarchen broxigarchen force-pushed the main-fix-true16-si-fold-1 branch from 1eb464f to 37ab639 Compare October 6, 2025 20:43
Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing test changes

@broxigarchen
Copy link
Contributor Author

broxigarchen commented Oct 7, 2025

Missing test changes

This patch is reverting some of the changes in #161764 so I think we don't need extra test here? The ir test added in the previous patch should cover it already

@broxigarchen
Copy link
Contributor Author

ping!

SubDef && TII.isFoldableCopy(*SubDef);
SubDef = MRI.getVRegDef(Sub->getReg())) {
unsigned SrcIdx = TII.getFoldableCopySrcIdx(*SubDef);
const int SrcIdx =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add some unit-tests to make sure this is working as expected ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@broxigarchen broxigarchen requested a review from arsenm October 8, 2025 16:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants